Running kns_committeesession pipeline

Run dependent pipelines


In [32]:
!{'cd /pipelines; KNESSET_LOAD_FROM_URL=1 dpp run ./committees/kns_documentcommitteesession >/dev/null'}


INFO    :RESULTS:
INFO    :SUCCESS: ./committees/kns_documentcommitteesession {'bytes': None, 'count_of_rows': 89695, 'dataset_name': '_', 'hash': 'ac64b23b6f5d861c6af83a1a48d9857f'}

In [33]:
!{'cd /pipelines; KNESSET_LOAD_FROM_URL=1 dpp run ./committees/kns_committee >/dev/null'}


INFO    :RESULTS:
INFO    :SUCCESS: ./committees/kns_committee {'bytes': None, 'count_of_rows': 756, 'dataset_name': '_', 'hash': '42785ce515755831aa4293685710da51'}

In [34]:
!{'cd /pipelines; KNESSET_LOAD_FROM_URL=1 dpp run ./committees/kns_cmtsessionitem >/dev/null'}


INFO    :RESULTS:
INFO    :SUCCESS: ./committees/kns_cmtsessionitem {'bytes': None, 'count_of_rows': 47595, 'dataset_name': '_', 'hash': '049808f2f8584198e962cb3585e7c970'}

In [35]:
!{'cd /pipelines; KNESSET_LOAD_FROM_URL=1 dpp run ./bills/kns_bill >/dev/null'}


INFO    :RESULTS:
INFO    :SUCCESS: ./bills/kns_bill {'bytes': None, 'count_of_rows': 44393, 'dataset_name': '_', 'hash': '0b09096c9da4ed1b8084e77e36db3f68'}
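
The four dependency runs above differ only in the pipeline ID, so the commands can be generated from a list. This is a minimal sketch, assuming the same `/pipelines` working directory and `KNESSET_LOAD_FROM_URL=1` setting used in the cells above. The helper name `build_dpp_command` is hypothetical, and the commands are only printed here, not executed:

```python
# Pipeline IDs taken from the cells above.
DEPENDENCIES = [
    './committees/kns_documentcommitteesession',
    './committees/kns_committee',
    './committees/kns_cmtsessionitem',
    './bills/kns_bill',
]


def build_dpp_command(pipeline_id, workdir='/pipelines'):
    """Hypothetical helper: build the same shell command the cells above run."""
    return f'cd {workdir}; KNESSET_LOAD_FROM_URL=1 dpp run {pipeline_id} >/dev/null'


commands = [build_dpp_command(p) for p in DEPENDENCIES]
for command in commands:
    print(command)
```

In a notebook cell, each generated string can be run with `!{command}`, in the same way as the cells above.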

Run the kns_committeesession pipeline

The following runs the pipeline's full pre- and post-processing steps, but the kns_committeesession data itself is loaded from storage rather than from the Knesset API:


In [44]:
!{'cd /pipelines; DATASERVICE_LOAD_FROM_URL=1 dpp run --verbose ./committees/kns_committeesession'}


[./committees/kns_committeesession:T_0] >>> INFO    :96b5eac3 RUNNING ./committees/kns_committeesession
[./committees/kns_committeesession:T_0] >>> INFO    :96b5eac3 Collecting dependencies
[./committees/kns_committeesession:T_0] >>> INFO    :96b5eac3 Running async task
[./committees/kns_committeesession:T_0] >>> INFO    :96b5eac3 Waiting for completion
[./committees/kns_committeesession:T_0] >>> INFO    :96b5eac3 Async task starting
[./committees/kns_committeesession:T_0] >>> INFO    :96b5eac3 Searching for existing caches
[./committees/kns_committeesession:T_0] >>> INFO    :Found cache for step 8: join
[./committees/kns_committeesession:T_0] >>> INFO    :96b5eac3 Building process chain:
[./committees/kns_committeesession:T_0] >>> INFO    :- cache_loader
[./committees/kns_committeesession:T_0] >>> INFO    :- join_session_bills
[./committees/kns_committeesession:T_0] >>> INFO    :- knesset.dump_to_path
[./committees/kns_committeesession:T_0] >>> INFO    :- knesset.dump_to_sql
[./committees/kns_committeesession:T_0] >>> INFO    :- (sink)
[./committees/kns_committeesession:T_0] >>> INFO    :join_session_bills: INFO    :Processed 74457 rows
[./committees/kns_committeesession:T_0] >>> INFO    :96b5eac3 DONE /usr/local/lib/python3.6/site-packages/datapackage_pipelines/specs/../lib/cache_loader.py
[./committees/kns_committeesession:T_0] >>> INFO    :96b5eac3 DONE /pipelines/committees/join_session_bills.py
[./committees/kns_committeesession:T_0] >>> INFO    :knesset.dump_to_path: INFO    :Processed 74457 rows
[./committees/kns_committeesession:T_0] >>> INFO    :knesset.dump_to_sql: INFO    :Processed 74457 rows
[./committees/kns_committeesession:T_0] >>> INFO    :96b5eac3 DONE /usr/local/lib/python3.6/site-packages/datapackage_pipelines/manager/../lib/internal/sink.py
[./committees/kns_committeesession:T_0] >>> INFO    :96b5eac3 DONE /pipelines/datapackage_pipelines_knesset/processors/dump_to_path.py
[./committees/kns_committeesession:T_0] >>> INFO    :96b5eac3 DONE /pipelines/datapackage_pipelines_knesset/processors/dump_to_sql.py
[./committees/kns_committeesession:T_0] >>> INFO    :96b5eac3 DONE V ./committees/kns_committeesession {'.dpp': {'out-datapackage-url': '../data/committees/kns_committeesession/datapackage.json'}, 'bytes': None, 'count_of_rows': 74457, 'dataset_name': '_', 'hash': '9eff39d663f87ea8b2b077f82b1735af', 'num_bills': 44393, 'num_sessions_not_related_to_legislation': 39613, 'num_sessions_related_to_legislation': 34844}
INFO    :RESULTS:
INFO    :SUCCESS: ./committees/kns_committeesession {'bytes': None, 'count_of_rows': 74457, 'dataset_name': '_', 'hash': '9eff39d663f87ea8b2b077f82b1735af', 'num_bills': 44393, 'num_sessions_not_related_to_legislation': 39613, 'num_sessions_related_to_legislation': 34844}
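
The stats in the SUCCESS line can be cross-checked with a little arithmetic. The sketch below copies the numbers from the logs above and assumes, without confirmation from the pipeline source, that every session is counted in exactly one of the two "related to legislation" groups. The variable names are illustrative only:

```python
# Stats copied from the SUCCESS lines in the run logs above.
session_stats = {
    'count_of_rows': 74457,
    'num_bills': 44393,
    'num_sessions_not_related_to_legislation': 39613,
    'num_sessions_related_to_legislation': 34844,
}
kns_bill_rows = 44393  # count_of_rows reported by ./bills/kns_bill

related = session_stats['num_sessions_related_to_legislation']
not_related = session_stats['num_sessions_not_related_to_legislation']

# Assumption: the two groups partition the sessions, so they sum to the row count.
assert related + not_related == session_stats['count_of_rows']
# The number of bills seen by the join matches the kns_bill dependency output.
assert session_stats['num_bills'] == kns_bill_rows

related_share = related / session_stats['count_of_rows']
print(f'{related_share:.1%} of sessions are related to legislation')
```

Both assertions hold for the run shown here, which suggests that no sessions were dropped or double-counted by the join step.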

Inspect the output


In [45]:
from dataflows import Flow, load
# results() returns (resources, datapackage, stats); take the rows of the first resource
sessions = Flow(load('/pipelines/data/committees/kns_committeesession/datapackage.json')).results()[0][0]

In [46]:
# Index the session rows by CommitteeSessionID for direct lookup
sessions = {s['CommitteeSessionID']: s for s in sessions}

In [47]:
# Show the first session that has at least one linked item
session = next(s for s in sessions.values() if s['item_ids'])
session


Out[47]:
{'CommitteeSessionID': 64990,
 'Number': None,
 'KnessetNum': 15,
 'TypeID': 161,
 'TypeDesc': 'פתוחה',
 'CommitteeID': 25,
 'Location': 'חדר הוועדה, באגף קדמה, קומה 1, חדר 1720',
 'SessionUrl': 'http://main.knesset.gov.il/Activity/committees/Pages/AllCommitteesAgenda.aspx?Tab=3&ItemID=64990',
 'BroadcastUrl': None,
 'StartDate': datetime.datetime(2002, 6, 12, 9, 0),
 'FinishDate': None,
 'Note': None,
 'LastUpdatedDate': datetime.datetime(2011, 4, 12, 5, 28, 59),
 'download_crc32c': None,
 'download_filename': None,
 'download_filesize': None,
 'parts_crc32c': None,
 'parts_filesize': None,
 'parts_parsed_filename': None,
 'text_crc32c': None,
 'text_filesize': None,
 'text_parsed_filename': None,
 'topics': ['חוק הבחירות לכנסת (תיקון מס\' 52), התשס"ד-2004'],
 'committee_name': 'החוקה, חוק ומשפט',
 'item_ids': [17755],
 'item_type_ids': [2],
 'bill_names': ['חוק הבחירות לכנסת (תיקון מס\' 52), התשס"ד-2004'],
 'bill_types': ['פרטית'],
 'related_to_legislation': True}
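
To go beyond a single record, the `related_to_legislation` flag can be tallied across all sessions. This is a minimal sketch: `summarize_sessions` is a hypothetical helper, and `sample_sessions` is made-up stand-in data carrying only the fields used here (the first entry mirrors the record above). In the notebook, the `sessions` dict built earlier would be passed instead:

```python
from collections import Counter


def summarize_sessions(sessions):
    """Hypothetical helper: count sessions by their related_to_legislation flag."""
    return Counter(bool(s['related_to_legislation']) for s in sessions.values())


# Stand-in records for illustration; IDs 1 and 2 are invented.
sample_sessions = {
    64990: {'CommitteeSessionID': 64990, 'item_ids': [17755], 'related_to_legislation': True},
    1: {'CommitteeSessionID': 1, 'item_ids': None, 'related_to_legislation': False},
    2: {'CommitteeSessionID': 2, 'item_ids': None, 'related_to_legislation': False},
}

counts = summarize_sessions(sample_sessions)
print('related to legislation:', counts[True])
print('not related to legislation:', counts[False])
```

Run against the real output, the two counts would be expected to match `num_sessions_related_to_legislation` and `num_sessions_not_related_to_legislation` from the pipeline stats.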